Learning with Biased Complementary Labels
Authors
Abstract
In this paper we study the classification problem in which we have access to an easily obtainable surrogate for the true labels, namely complementary labels, which specify classes that observations do not belong to. For example, if one is familiar with monkeys but not meerkats, a meerkat is easily identified as not a monkey, so “monkey” is annotated to the meerkat as a complementary label. Specifically, let Y and Ȳ be the true and complementary labels, respectively. We first model the annotation of complementary labels via the transition probabilities P(Ȳ = i|Y = j), i ≠ j ∈ {1, · · · , c}, where c is the number of classes. All the previous methods implicitly assume that the transition probabilities P(Ȳ = i|Y = j) are identical, which is far from true in practice because humans are biased toward their own experience. For example, if a person is more familiar with monkeys than prairie dogs when providing complementary labels for meerkats, he/she is more likely to employ “monkey” as a complementary label. We therefore reason that the transition probabilities will be different. In this paper, we address three fundamental problems raised by learning with biased complementary labels. (1) How to estimate the transition probabilities? (2) How to modify the traditional loss functions and extend standard deep neural network classifiers to learn with biased complementary labels? (3) Does the classifier learned from examples with complementary labels by our proposed method converge to the optimal one learned from examples with true labels? Comprehensive experiments on MNIST, CIFAR10, CIFAR100, and Tiny ImageNet empirically validate the superiority of the proposed method to the current state-of-the-art methods with accuracy gains of over 10%.
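The abstract describes modifying a traditional loss so a classifier can be trained from complementary labels through the transition probabilities P(Ȳ = i|Y = j). A minimal sketch of this idea, assuming a known transition matrix Q with zero diagonal and a classifier that outputs class probabilities (the function name and the uniform example matrix below are illustrative, not taken from the paper):

```python
# Sketch: a complementary-label loss built from a transition matrix.
# Assumption: Q[j, i] = P(Ybar = i | Y = j), rows sum to 1, zero diagonal.
import numpy as np

def complementary_loss(probs, bar_y, Q):
    """Cross-entropy on the transformed probabilities Q^T p
    against the observed complementary label bar_y."""
    # P(Ybar = i | x) = sum_j Q[j, i] * P(Y = j | x)
    bar_probs = Q.T @ probs
    return -np.log(bar_probs[bar_y] + 1e-12)

# Uniform (unbiased) transitions for c = 3 classes: off-diagonals 1/(c - 1).
c = 3
Q = (np.ones((c, c)) - np.eye(c)) / (c - 1)

probs = np.array([0.7, 0.2, 0.1])   # classifier output for one example
loss = complementary_loss(probs, bar_y=2, Q=Q)
```

Minimizing this loss drives the transformed distribution Q^T p toward the observed complementary labels, which in turn constrains the underlying true-label probabilities p; a biased (non-uniform) Q simply replaces the uniform matrix above.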
∗UBTECH Sydney AI Centre and the School of Information Technologies, Faculty of Engineering and Information Technologies, The University of Sydney, NSW 2006, Australia, [email protected], [email protected], [email protected]
†Department of Biomedical Informatics, University of Pittsburgh, [email protected]
arXiv:1711.09535v2 [stat.ML] 3 Dec 2017
Figure 1: A comparison between true labels (top) and complementary labels (bottom). [Image: a meerkat, a monkey, and a prairie dog shown with true labels “Meerkat”, “Monkey”, “Prairie Dog” and complementary labels “Not monkey”, “Not prairie dog”, “Not meerkat”.]
Similar resources
Social Context Effects on the Impact of Category Labels
We explore whether social context affects how labels (relative to other features) affect category learning. We taught 104 participants four novel categories using a feature inference task. In a between-participants design, we manipulated: 1) the social context of the task (social context vs. on the computer); and 2) which dimension of the category members could be used to perfectly predict the ...
Learning from Complementary Labels
Collecting labeled data is costly and thus is a critical bottleneck in real-world classification tasks. To mitigate the problem, we consider a complementary label, which specifies a class that a pattern does not belong to. Collecting complementary labels would be less laborious than ordinary labels since users do not have to carefully choose the correct class from many candidate classes. Howeve...
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
Building a Biased Least Squares Support Vector Machine Classifier for Positive and Unlabeled Learning
Learning from positive and unlabeled examples (PU learning) is a special case of semi-supervised binary classification. The key feature of PU learning is that there is no labeled negative training data, which makes the traditional classification techniques inapplicable. Similar to the idea of Biased-SVM which is one of the most famous classifier, a biased least squares support vector machine cl...
Hierarchical Mixtures of GLMs for Combining Multiple Ground Truths
In real-world machine learning problems it is often the case that the gold-standard for a particular learning problem is not accurately reflected by any one particular data set. For example, when modeling the landing-page quality associated with a search result, labels from human evaluators are often biased towards “brandname” sites, whereas labels derived from conversions can potentially confo...
Journal: CoRR
Volume: abs/1711.09535
Pages: -
Publication year: 2017